Overview

Dataset statistics

Number of variables: 18
Number of observations: 4207977
Missing cells: 17732808
Missing cells (%): 23.4%
Duplicate rows: 13451
Duplicate rows (%): 0.3%
Total size in memory: 2.7 GiB
Average record size in memory: 684.6 B
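
The percentages and the total size follow directly from the counts. A minimal Python sketch that recomputes them from the figures quoted above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 18
n_observations = 4_207_977
missing_cells = 17_732_808
duplicate_rows = 13_451
avg_record_bytes = 684.6

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# The total size is reported to one decimal only, so expect rounding.
total_gib = avg_record_bytes * n_observations / 2**30

print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"total size:     {total_gib:.1f} GiB")
```

Missing cells are counted over all 18 x 4207977 cells, whereas duplicate rows are counted over rows only.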

Variable types

Numeric: 7
DateTime: 2
Unsupported: 2
Text: 3
Categorical: 4

Alerts

Dataset has 13451 (0.3%) duplicate rows [Duplicates]
trip_duration has 2517155 (59.8%) missing values [Missing]
bike_id has 2517155 (59.8%) missing values [Missing]
user_type has 2517652 (59.8%) missing values [Missing]
birth_year has 2560677 (60.9%) missing values [Missing]
gender has 2517155 (59.8%) missing values [Missing]
ride_id has 1690822 (40.2%) missing values [Missing]
rideable_type has 1690822 (40.2%) missing values [Missing]
member_casual has 1690822 (40.2%) missing values [Missing]
trip_duration is highly skewed (γ1 = 575.7622365) [Skewed]
end_station_latitude is highly skewed (γ1 = -73.00992481) [Skewed]
end_station_longitude is highly skewed (γ1 = 73.04144444) [Skewed]
start_station_id is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
end_station_id is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
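
The two groups of missing-value alerts are complementary. This suggests (an inference from the counts, not something the report states) that the dataset concatenates two export schemas: one carrying trip_duration, bike_id and gender, the other carrying ride_id, rideable_type and member_casual. A quick check, with "schema A" and "schema B" as hypothetical labels:

```python
n_observations = 4_207_977
missing_schema_a_cols = 2_517_155  # trip_duration, bike_id, gender
missing_schema_b_cols = 1_690_822  # ride_id, rideable_type, member_casual

# If every row belongs to exactly one schema, the two counts partition the rows.
rows_with_schema_a = n_observations - missing_schema_a_cols
rows_with_schema_b = n_observations - missing_schema_b_cols

print(rows_with_schema_a == missing_schema_b_cols)
print(rows_with_schema_b == missing_schema_a_cols)
print(missing_schema_a_cols + missing_schema_b_cols == n_observations)
```

user_type and birth_year report slightly more missing values than 2517155 (497 and 43522 more, respectively), so under this reading they are also missing in some schema A rows.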

Reproduction

Analysis started: 2024-01-21 19:17:35.025783
Analysis finished: 2024-01-21 19:22:14.376943
Duration: 4 minutes and 39.35 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
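
A report of this kind can be regenerated along the following lines. This is a sketch under assumptions: the input path, output path and report title are hypothetical, and the original run may have used a different configuration (see config.json). The first part only re-derives the reported duration from the two timestamps.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-01-21 19:17:35.025783")
finished = datetime.fromisoformat("2024-01-21 19:22:14.376943")
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")


def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file; csv_path and html_path are hypothetical names."""
    # Imported lazily so the duration check above runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Trip data profile")
    report.to_file(html_path)
    return report
```

Calling build_report("trips.csv") would write report.html next to the script, provided pandas and ydata-profiling are installed.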

Variables

trip_duration
Real number (ℝ)

MISSING  SKEWED 

Distinct: 15182
Distinct (%): 0.9%
Missing: 2517155
Missing (%): 59.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 934.3995
Minimum: 61
Maximum: 20260211
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.2 MiB

Quantile statistics

Minimum: 61
5-th percentile: 139
Q1: 245
median: 383
Q3: 694
95-th percentile: 2238
Maximum: 20260211
Range: 20260150
Interquartile range (IQR): 449
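
The range and IQR above are simple differences of the listed quantiles. The sketch below recomputes them and adds Tukey's 1.5 x IQR fences as one possible outlier rule. The fences, and the reading of trip_duration as seconds, are assumptions; the report states neither.

```python
# Quantiles copied from the trip_duration block.
minimum, q1, median, q3, maximum = 61, 245, 383, 694, 20_260_211

value_range = maximum - minimum
iqr = q3 - q1
# Tukey's rule flags values more than 1.5 * IQR outside the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Assumption: durations are in seconds (the unit is not stated in the report).
max_days = maximum / 86_400

print(value_range, iqr, lower_fence, upper_fence, round(max_days, 1))
```

The 95-th percentile (2238) already lies above the upper fence, so this rule would flag more than 5% of the non-missing rides. A domain-specific cap may suit such a heavy-tailed variable better.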

Descriptive statistics

Standard deviation: 23900.616
Coefficient of variation (CV): 25.578584
Kurtosis: 437929.26
Mean: 934.3995
Median Absolute Deviation (MAD): 173
Skewness: 575.76224
Sum: 1.5799032 × 10^9
Variance: 5.7123944 × 10^8
Monotonicity: Not monotonic
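
For reference, the statistics above can be computed as in the sketch below, which uses only the standard library. The skewness formula is the bias-adjusted sample skewness. That the report uses exactly this estimator is an assumption, and the toy data is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness coefficient."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    g1 = m3 / m2 ** 1.5
    return g1 * (n * (n - 1)) ** 0.5 / (n - 2)


# Consistency check against the reported figures.
reported_cv = 23900.616 / 934.3995
reported_variance = 23900.616 ** 2

# Toy data (hypothetical): one extreme value dominates the mean-based statistics.
toy = [245, 383, 694, 420, 310, 20_000]
print(coefficient_of_variation(toy), median_absolute_deviation(toy), sample_skewness(toy))
```

The MAD of 173 stays close to the bulk of the data, whereas the standard deviation of 23900.616 is driven by a handful of very long trips.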
Histogram with fixed size bins (bins=50)
Common Values

Value                 Count    Frequency (%)
244                   3951     0.1%
246                   3930     0.1%
231                   3903     0.1%
235                   3896     0.1%
256                   3895     0.1%
238                   3893     0.1%
239                   3888     0.1%
243                   3884     0.1%
240                   3883     0.1%
252                   3876     0.1%
Other values (15172)  1651823  39.3%
(Missing)             2517155  59.8%

Minimum 10 values

Value  Count  Frequency (%)
61     280    < 0.1%
62     291    < 0.1%
63     337    < 0.1%
64     360    < 0.1%
65     331    < 0.1%
66     354    < 0.1%
67     378    < 0.1%
68     353    < 0.1%
69     427    < 0.1%
70     451    < 0.1%

Maximum 10 values

Value     Count  Frequency (%)
20260211  1      < 0.1%
16329808  1      < 0.1%
5366099   1      < 0.1%
4826890   1      < 0.1%
3261756   1      < 0.1%
3076192   1      < 0.1%
2566420   1      < 0.1%
2423337   1      < 0.1%
2181628   1      < 0.1%
2135016   1      < 0.1%

DateTime variable (name not preserved in this extraction)

Distinct: 4117220
Distinct (%): 97.8%
Missing: 0
Missing (%): 0.0%
Memory size: 64.2 MiB
Minimum: 2015-09-21 14:53:16
Maximum: 2023-12-31 23:59:57
Histogram with fixed size bins (bins=50)

DateTime variable (name not preserved in this extraction)

Distinct: 4118361
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 64.2 MiB
Minimum: 2015-09-21 14:54:17
Maximum: 2024-01-01 23:12:00
Histogram with fixed size bins (bins=50)

start_station_id
Unsupported

REJECTED  UNSUPPORTED 

Missing: 95
Missing (%): < 0.1%
Memory size: 239.0 MiB
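
The report only says that start_station_id has an unsupported type. One common cause, assumed here rather than confirmed by the report, is a column mixing integers, floats and strings. A sketch of a normalizer that maps such values to clean strings (the function name and the example IDs are hypothetical):

```python
import math


def normalize_station_id(value):
    """Return a station ID as a clean string, or None if it is missing."""
    if value is None:
        return None
    if isinstance(value, float):
        if math.isnan(value):
            return None
        # 3183.0 and 3183 should map to the same key.
        if value.is_integer():
            return str(int(value))
        return str(value)
    text = str(value).strip()
    return text or None


raw_ids = [3183, 3183.0, "3183", " JC013 ", float("nan"), None, ""]
print([normalize_station_id(v) for v in raw_ids])
```

After normalization the column has a single type and can be profiled as text or categorical.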

Text variable (station names; column name not preserved in this extraction)

Distinct: 369
Distinct (%): < 0.1%
Missing: 95
Missing (%): < 0.1%
Memory size: 328.6 MiB

Length

Max length: 44
Median length: 38
Mean length: 16.877984
Min length: 4

Characters and Unicode

Total characters: 71020565
Distinct characters: 66
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 208
Unique (%): < 0.1%

Sample

1st row: Marin Light Rail
2nd row: Exchange Place
3rd row: Exchange Place
4th row: McGinley Square
5th row: Exchange Place

Value                  Count    Frequency (%)
st                     2490687  16.7%
(value not preserved)  1876230  12.6%
ave                    496808   3.3%
park                   483465   3.2%
path                   434267   2.9%
grove                  298896   2.0%
hudson                 286197   1.9%
newport                260171   1.7%
rail                   248972   1.7%
light                  248972   1.7%
Other values (361)     7776735  52.2%

Most occurring characters

ValueCountFrequency (%)
10693520
 
15.1%
t 5132536
 
7.2%
r 4226406
 
6.0%
a 4208226
 
5.9%
e 3817465
 
5.4%
o 3609308
 
5.1%
n 3600063
 
5.1%
S 3102968
 
4.4%
i 3084224
 
4.3%
l 2469841
 
3.5%
Other values (56) 27076008
38.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 43791344
61.7%
Uppercase Letter 13151483
 
18.5%
Space Separator 10693520
 
15.1%
Decimal Number 1507980
 
2.1%
Other Punctuation 1346159
 
1.9%
Dash Punctuation 530077
 
0.7%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 5132536
11.7%
r 4226406
9.7%
a 4208226
9.6%
e 3817465
 
8.7%
o 3609308
 
8.2%
n 3600063
 
8.2%
i 3084224
 
7.0%
l 2469841
 
5.6%
s 1870499
 
4.3%
k 1372539
 
3.1%
Other values (15) 10400237
23.7%
Uppercase Letter
ValueCountFrequency (%)
S 3102968
23.6%
H 1557710
11.8%
P 1488032
11.3%
A 994412
 
7.6%
M 720993
 
5.5%
C 699216
 
5.3%
T 631358
 
4.8%
L 536903
 
4.1%
G 536253
 
4.1%
W 511345
 
3.9%
Other values (14) 2372293
18.0%
Decimal Number
ValueCountFrequency (%)
1 616037
40.9%
4 201647
 
13.4%
6 197622
 
13.1%
2 128546
 
8.5%
5 83554
 
5.5%
8 72844
 
4.8%
9 64758
 
4.3%
7 61192
 
4.1%
3 57816
 
3.8%
0 23964
 
1.6%
Other Punctuation
ValueCountFrequency (%)
& 1346157
> 99.9%
' 1
 
< 0.1%
. 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
10693520
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 530077
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56942827
80.2%
Common 14077738
 
19.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 5132536
 
9.0%
r 4226406
 
7.4%
a 4208226
 
7.4%
e 3817465
 
6.7%
o 3609308
 
6.3%
n 3600063
 
6.3%
S 3102968
 
5.4%
i 3084224
 
5.4%
l 2469841
 
4.3%
s 1870499
 
3.3%
Other values (39) 21821291
38.3%
Common
ValueCountFrequency (%)
10693520
76.0%
& 1346157
 
9.6%
1 616037
 
4.4%
- 530077
 
3.8%
4 201647
 
1.4%
6 197622
 
1.4%
2 128546
 
0.9%
5 83554
 
0.6%
8 72844
 
0.5%
9 64758
 
0.5%
Other values (7) 142976
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 71020565
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10693520
 
15.1%
t 5132536
 
7.2%
r 4226406
 
6.0%
a 4208226
 
5.9%
e 3817465
 
5.4%
o 3609308
 
5.1%
n 3600063
 
5.1%
S 3102968
 
4.4%
i 3084224
 
4.3%
l 2469841
 
3.5%
Other values (56) 27076008
38.1%

start_station_latitude
Real number (ℝ)

Distinct: 128761
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.728049
Minimum: 40.678334
Maximum: 40.863943
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.2 MiB

Quantile statistics

Minimum: 40.678334
5-th percentile: 40.712774
Q1: 40.719252
median: 40.725685
Q3: 40.736982
95-th percentile: 40.749985
Maximum: 40.863943
Range: 0.18560923
Interquartile range (IQR): 0.017730518

Descriptive statistics

Standard deviation: 0.01143156
Coefficient of variation (CV): 0.00028068028
Kurtosis: -0.64948803
Mean: 40.728049
Median Absolute Deviation (MAD): 0.0079845164
Skewness: 0.5750787
Sum: 1.7138269 × 10^8
Variance: 0.00013068057
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                  Count    Frequency (%)
40.71958612            239934   5.7%
40.72759597            144046   3.4%
40.7272235             120899   2.9%
40.7287448             111351   2.6%
40.71458404            92164    2.2%
40.7192517             80268    1.9%
40.7177325             79907    1.9%
40.7211236             78827    1.9%
40.7112423             78278    1.9%
40.72152515            74426    1.8%
Other values (128751)  3107877  73.9%

Minimum 10 values

Value        Count  Frequency (%)
40.67833407  1      < 0.1%
40.69187865  1      < 0.1%
40.69263997  317    < 0.1%
40.69531906  1      < 0.1%
40.6970299   520    < 0.1%
40.69865054  630    < 0.1%
40.69967915  1      < 0.1%
40.70068073  1      < 0.1%
40.70362493  1      < 0.1%
40.70452571  1      < 0.1%

Maximum 10 values

Value        Count  Frequency (%)
40.8639433   1      < 0.1%
40.8066187   1      < 0.1%
40.8028376   1      < 0.1%
40.79953315  1      < 0.1%
40.79118803  1      < 0.1%
40.78903018  9      < 0.1%
40.78179272  1      < 0.1%
40.78096507  1      < 0.1%
40.77986812  2      < 0.1%
40.77917727  1      < 0.1%

start_station_longitude
Real number (ℝ)

Distinct: 136915
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -74.042837
Minimum: -74.096937
Maximum: -73.941173
Zeros: 0
Zeros (%): 0.0%
Negative: 4207977
Negative (%): 100.0%
Memory size: 64.2 MiB

Quantile statistics

Minimum: -74.096937
5-th percentile: -74.067537
Q1: -74.04879
median: -74.042817
Q3: -74.033552
95-th percentile: -74.027781
Maximum: -73.941173
Range: 0.1557632
Interquartile range (IQR): 0.015238436

Descriptive statistics

Standard deviation: 0.01205685
Coefficient of variation (CV): -0.00016283615
Kurtosis: 0.5314823
Mean: -74.042837
Median Absolute Deviation (MAD): 0.0081729438
Skewness: -0.92699455
Sum: -3.1157056 × 10^8
Variance: 0.00014536764
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                  Count    Frequency (%)
-74.04311746           239934   5.7%
-74.04424731           144046   3.4%
-74.0337589            120899   2.9%
-74.0321082            111351   2.6%
-74.04281706           92164    2.2%
-74.034234             84417    2.0%
-74.043845             83397    2.0%
-74.03805095           78827    1.9%
-74.0557013            78278    1.9%
-74.04630454           74426    1.8%
Other values (136905)  3100238  73.7%

Minimum 10 values

Value         Count  Frequency (%)
-74.0969366   520    < 0.1%
-74.08896387  2      < 0.1%
-74.0887723   717    < 0.1%
-74.08801228  317    < 0.1%
-74.08722293  1      < 0.1%
-74.08714485  1      < 0.1%
-74.08706903  1      < 0.1%
-74.0868541   1      < 0.1%
-74.08684921  1      < 0.1%
-74.08684228  1      < 0.1%

Maximum 10 values

Value         Count  Frequency (%)
-73.9411734   1      < 0.1%
-73.99727857  1      < 0.1%
-73.99793313  9      < 0.1%
-73.99886102  1      < 0.1%
-74.00098978  2      < 0.1%
-74.00336778  1      < 0.1%
-74.00455577  1      < 0.1%
-74.00700725  1      < 0.1%
-74.00907335  1      < 0.1%
-74.01260012  4      < 0.1%

end_station_id
Unsupported

REJECTED  UNSUPPORTED 

Missing: 10023
Missing (%): 0.2%
Memory size: 238.7 MiB

Text variable (station names; column name not preserved in this extraction)

Distinct: 752
Distinct (%): < 0.1%
Missing: 10023
Missing (%): 0.2%
Memory size: 328.4 MiB

Length

Max length: 55
Median length: 40
Mean length: 16.925813
Min length: 4

Characters and Unicode

Total characters: 71053785
Distinct characters: 69
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 249
Unique (%): < 0.1%

Sample

1st row: City Hall
2nd row: Heights Elevator
3rd row: Newark Ave
4th row: Danforth Light Rail
5th row: Hamilton Park

Value                  Count    Frequency (%)
st                     2546618  17.0%
(value not preserved)  1871933  12.5%
path                   490382   3.3%
park                   473715   3.2%
ave                    471382   3.2%
grove                  352490   2.4%
hudson                 285560   1.9%
newport                262625   1.8%
rail                   252313   1.7%
light                  252313   1.7%
Other values (548)     7701491  51.5%

Most occurring characters

ValueCountFrequency (%)
10762885
 
15.1%
t 5162173
 
7.3%
r 4192042
 
5.9%
a 4174103
 
5.9%
e 3817832
 
5.4%
o 3622544
 
5.1%
n 3557476
 
5.0%
S 3138965
 
4.4%
i 3018923
 
4.2%
l 2437399
 
3.4%
Other values (59) 27169443
38.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 43527447
61.3%
Uppercase Letter 13386882
 
18.8%
Space Separator 10762885
 
15.1%
Decimal Number 1503180
 
2.1%
Other Punctuation 1341569
 
1.9%
Dash Punctuation 531787
 
0.7%
Open Punctuation 17
 
< 0.1%
Close Punctuation 17
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 5162173
11.9%
r 4192042
9.6%
a 4174103
9.6%
e 3817832
 
8.8%
o 3622544
 
8.3%
n 3557476
 
8.2%
i 3018923
 
6.9%
l 2437399
 
5.6%
s 1835305
 
4.2%
k 1349348
 
3.1%
Other values (15) 10360302
23.8%
Uppercase Letter
ValueCountFrequency (%)
S 3138965
23.4%
H 1600574
12.0%
P 1550870
11.6%
A 1025662
 
7.7%
M 698645
 
5.2%
C 696110
 
5.2%
T 686506
 
5.1%
G 571738
 
4.3%
L 543619
 
4.1%
W 516189
 
3.9%
Other values (15) 2358004
17.6%
Decimal Number
ValueCountFrequency (%)
1 624688
41.6%
4 203863
 
13.6%
6 183433
 
12.2%
2 130169
 
8.7%
5 80346
 
5.3%
8 72382
 
4.8%
9 65460
 
4.4%
7 61894
 
4.1%
3 55366
 
3.7%
0 25579
 
1.7%
Other Punctuation
ValueCountFrequency (%)
& 1340167
99.9%
' 1384
 
0.1%
\ 17
 
< 0.1%
. 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
10762885
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 531787
100.0%
Open Punctuation
ValueCountFrequency (%)
( 17
100.0%
Close Punctuation
ValueCountFrequency (%)
) 17
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56914329
80.1%
Common 14139456
 
19.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 5162173
 
9.1%
r 4192042
 
7.4%
a 4174103
 
7.3%
e 3817832
 
6.7%
o 3622544
 
6.4%
n 3557476
 
6.3%
S 3138965
 
5.5%
i 3018923
 
5.3%
l 2437399
 
4.3%
s 1835305
 
3.2%
Other values (40) 21957567
38.6%
Common
ValueCountFrequency (%)
10762885
76.1%
& 1340167
 
9.5%
1 624688
 
4.4%
- 531787
 
3.8%
4 203863
 
1.4%
6 183433
 
1.3%
2 130169
 
0.9%
5 80346
 
0.6%
8 72382
 
0.5%
9 65460
 
0.5%
Other values (9) 144276
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 71053785
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10762885
 
15.1%
t 5162173
 
7.3%
r 4192042
 
5.9%
a 4174103
 
5.9%
e 3817832
 
5.4%
o 3622544
 
5.1%
n 3557476
 
5.0%
S 3138965
 
4.4%
i 3018923
 
4.2%
l 2437399
 
3.4%
Other values (59) 27169443
38.2%

end_station_latitude
Real number (ℝ)

SKEWED 

Distinct: 867
Distinct (%): < 0.1%
Missing: 5156
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.720227
Minimum: 0
Maximum: 40.872412
Zeros: 787
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 64.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 40.712774
Q1: 40.719252
median: 40.72534
Q3: 40.736068
95-th percentile: 40.749985
Maximum: 40.872412
Range: 40.872412
Interquartile range (IQR): 0.016815956

Descriptive statistics

Standard deviation: 0.55739221
Coefficient of variation (CV): 0.013688337
Kurtosis: 5330.7391
Mean: 40.720227
Median Absolute Deviation (MAD): 0.0076074254
Skewness: -73.009925
Sum: 1.7113983 × 10^8
Variance: 0.31068607
Monotonicity: Not monotonic
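
The extreme skewness here goes together with the 787 zero values: a latitude of 0 is far from every other observation (all near 40.7). A plausible reading, assumed here and not stated in the report, is that 0 is a placeholder for an unknown end station. The sketch below masks such pairs and estimates the mean without the zeros, using only figures from this block (the function name is hypothetical):

```python
import math


def mask_zero_coordinates(lat, lon):
    """Treat an exact (0, 0) pair as missing; other values pass through."""
    if lat == 0 and lon == 0:
        return (math.nan, math.nan)
    return (lat, lon)


# Figures copied from the end_station_latitude block.
n_rows = 4_207_977
n_missing = 5_156
n_zeros = 787
reported_sum = 1.7113983e8

# Zeros add nothing to the sum, so only the denominator changes.
n_nonzero = n_rows - n_missing - n_zeros
mean_without_zeros = reported_sum / n_nonzero
print(n_nonzero, round(mean_without_zeros, 4))
```

The estimate is close to the start_station_latitude mean of 40.728049, which supports treating the zeros as placeholders rather than real coordinates.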
Histogram with fixed size bins (bins=50)

Common Values

Value               Count    Frequency (%)
40.71958612         299334   7.1%
40.72759597         153663   3.7%
40.7272235          137393   3.3%
40.7287448          123975   2.9%
40.71458404         101681   2.4%
40.7177325          92844    2.2%
40.73606766         92298    2.2%
40.7112423          91751    2.2%
40.73698222         91245    2.2%
40.7192517          89750    2.1%
Other values (857)  2928887  69.6%

Minimum 10 values

Value       Count  Frequency (%)
0           787    < 0.1%
40.64       1      < 0.1%
40.64507    1      < 0.1%
40.64621    1      < 0.1%
40.646377   1      < 0.1%
40.649292   1      < 0.1%
40.64958    1      < 0.1%
40.65       4      < 0.1%
40.66       3      < 0.1%
40.6630619  2      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
40.872412  1      < 0.1%
40.86448   1      < 0.1%
40.863124  1      < 0.1%
40.86156   1      < 0.1%
40.861382  1      < 0.1%
40.852252  1      < 0.1%
40.85168   7      < 0.1%
40.85      2      < 0.1%
40.849972  21     < 0.1%
40.848467  1      < 0.1%

end_station_longitude
Real number (ℝ)

SKEWED 

Distinct: 870
Distinct (%): < 0.1%
Missing: 5156
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: -74.028609
Minimum: -74.19
Maximum: 0
Zeros: 787
Zeros (%): < 0.1%
Negative: 4202034
Negative (%): 99.9%
Memory size: 64.2 MiB

Quantile statistics

Minimum: -74.19
5-th percentile: -74.066921
Q1: -74.047727
median: -74.042817
Q3: -74.033552
95-th percentile: -74.027781
Maximum: 0
Range: 74.19
Interquartile range (IQR): 0.014174725

Descriptive statistics

Standard deviation: 1.0131829
Coefficient of variation (CV): -0.013686369
Kurtosis: 5333.808
Mean: -74.028609
Median Absolute Deviation (MAD): 0.0081729438
Skewness: 73.041444
Sum: -3.1112899 × 10^8
Variance: 1.0265395
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count    Frequency (%)
-74.04311746        299334   7.1%
-74.04424731        153663   3.7%
-74.0337589         137393   3.3%
-74.0321082         123975   2.9%
-74.04281706        101681   2.4%
-74.043845          93325    2.2%
-74.02912706        92298    2.2%
-74.0557013         91751    2.2%
-74.02778059        91245    2.2%
-74.034234          90283    2.1%
Other values (860)  2927873  69.6%

Minimum 10 values

Value        Count  Frequency (%)
-74.19       1      < 0.1%
-74.18       1      < 0.1%
-74.16       3      < 0.1%
-74.15       4      < 0.1%
-74.14       4      < 0.1%
-74.13       1      < 0.1%
-74.12       6      < 0.1%
-74.11       20     < 0.1%
-74.1        42     < 0.1%
-74.0969366  568    < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
0           787    < 0.1%
-73.888271  1      < 0.1%
-73.888719  1      < 0.1%
-73.89      1      < 0.1%
-73.89081   2      < 0.1%
-73.891677  1      < 0.1%
-73.892103  1      < 0.1%
-73.89522   1      < 0.1%
-73.896599  2      < 0.1%
-73.89795   1      < 0.1%

bike_id
Real number (ℝ)

MISSING 

Distinct: 3116
Distinct (%): 0.2%
Missing: 2517155
Missing (%): 59.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 30547.503
Minimum: 14529
Maximum: 49985
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.2 MiB

Quantile statistics

Minimum: 14529
5-th percentile: 24475
Q1: 26220
median: 29302
Q3: 31710
95-th percentile: 44225
Maximum: 49985
Range: 35456
Interquartile range (IQR): 5490

Descriptive statistics

Standard deviation: 6280.7493
Coefficient of variation (CV): 0.20560598
Kurtosis: 0.32846665
Mean: 30547.503
Median Absolute Deviation (MAD): 3054
Skewness: 1.1727291
Sum: 5.1650391 × 10^10
Variance: 39447812
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count    Frequency (%)
26159                2704     0.1%
26260                2694     0.1%
26170                2678     0.1%
26236                2675     0.1%
26161                2674     0.1%
26306                2667     0.1%
26431                2658     0.1%
26192                2638     0.1%
26213                2628     0.1%
26286                2617     0.1%
Other values (3106)  1664189  39.5%
(Missing)            2517155  59.8%

Minimum 10 values

Value  Count  Frequency (%)
14529  3      < 0.1%
14531  45     < 0.1%
14536  20     < 0.1%
14552  10     < 0.1%
14556  34     < 0.1%
14578  3      < 0.1%
14585  104    < 0.1%
14598  59     < 0.1%
14607  178    < 0.1%
14632  1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
49985  6      < 0.1%
49734  3      < 0.1%
49527  14     < 0.1%
49081  20     < 0.1%
49058  5      < 0.1%
48932  96     < 0.1%
48930  63     < 0.1%
48929  61     < 0.1%
48924  92     < 0.1%
48923  68     < 0.1%

user_type
Categorical

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 2517652
Missing (%): 59.8%
Memory size: 274.2 MiB
Subscriber: 1481426
Customer: 208899

Length

Max length: 10
Median length: 10
Mean length: 9.7528298
Min length: 8

Characters and Unicode

Total characters: 16485452
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Subscriber
2nd row: Subscriber
3rd row: Subscriber
4th row: Subscriber
5th row: Subscriber

Common Values

Value       Count    Frequency (%)
Subscriber  1481426  35.2%
Customer    208899   5.0%
(Missing)   2517652  59.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count    Frequency (%)
subscriber  1481426  87.6%
customer    208899   12.4%

Most occurring characters

ValueCountFrequency (%)
r 3171751
19.2%
b 2962852
18.0%
s 1690325
10.3%
u 1690325
10.3%
e 1690325
10.3%
S 1481426
9.0%
c 1481426
9.0%
i 1481426
9.0%
C 208899
 
1.3%
t 208899
 
1.3%
Other values (2) 417798
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14795127
89.7%
Uppercase Letter 1690325
 
10.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 3171751
21.4%
b 2962852
20.0%
s 1690325
11.4%
u 1690325
11.4%
e 1690325
11.4%
c 1481426
10.0%
i 1481426
10.0%
t 208899
 
1.4%
o 208899
 
1.4%
m 208899
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
S 1481426
87.6%
C 208899
 
12.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 16485452
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 3171751
19.2%
b 2962852
18.0%
s 1690325
10.3%
u 1690325
10.3%
e 1690325
10.3%
S 1481426
9.0%
c 1481426
9.0%
i 1481426
9.0%
C 208899
 
1.3%
t 208899
 
1.3%
Other values (2) 417798
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16485452
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 3171751
19.2%
b 2962852
18.0%
s 1690325
10.3%
u 1690325
10.3%
e 1690325
10.3%
S 1481426
9.0%
c 1481426
9.0%
i 1481426
9.0%
C 208899
 
1.3%
t 208899
 
1.3%
Other values (2) 417798
 
2.5%

birth_year
Real number (ℝ)

MISSING 

Distinct: 82
Distinct (%): < 0.1%
Missing: 2560677
Missing (%): 60.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1980.5658
Minimum: 1887
Maximum: 2004
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.2 MiB

Quantile statistics

Minimum: 1887
5-th percentile: 1960
Q1: 1974
median: 1983
Q3: 1988
95-th percentile: 1994
Maximum: 2004
Range: 117
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 10.347948
Coefficient of variation (CV): 0.0052247435
Kurtosis: 1.402828
Mean: 1980.5658
Median Absolute Deviation (MAD): 6
Skewness: -0.87854412
Sum: 3.262586 × 10^9
Variance: 107.08003
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value              Count    Frequency (%)
1969               110638   2.6%
1989               84411    2.0%
1986               83332    2.0%
1988               82271    2.0%
1987               75599    1.8%
1984               72789    1.7%
1990               71698    1.7%
1983               66868    1.6%
1985               64856    1.5%
1981               63137    1.5%
Other values (72)  871701   20.7%
(Missing)          2560677  60.9%

Minimum 10 values

Value  Count  Frequency (%)
1887   71     < 0.1%
1888   260    < 0.1%
1889   4      < 0.1%
1900   65     < 0.1%
1901   1      < 0.1%
1904   19     < 0.1%
1905   1      < 0.1%
1918   2      < 0.1%
1920   1      < 0.1%
1923   3      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
2004   26     < 0.1%
2003   291    < 0.1%
2002   677    < 0.1%
2001   1490   < 0.1%
2000   2788   0.1%
1999   2873   0.1%
1998   5880   0.1%
1997   7879   0.2%
1996   16611  0.4%
1995   20766  0.5%
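
Birth years such as 1887 or 1900 imply riders well over 100 years old, since the earliest timestamp in this report is from 2015. A sketch of a plausibility filter follows; the 0 to 100 year age window is an assumption chosen for illustration, not a rule taken from the report.

```python
def plausible_birth_year(birth_year, ride_year, max_age=100, min_age=0):
    """Return birth_year if the implied age is plausible, else None."""
    if birth_year is None:
        return None
    age = ride_year - birth_year
    if min_age <= age <= max_age:
        return birth_year
    return None


# 1887 would imply an age of 128 for a ride in 2015.
print(plausible_birth_year(1887, 2015))
print(plausible_birth_year(1983, 2015))
```

Values rejected by the filter are better set to missing than dropped, so that the ride itself stays in the dataset.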

gender
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 2517155
Missing (%): 59.8%
Memory size: 263.3 MiB
1.0: 1168398
2.0: 375403
0.0: 147021

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 5072466
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count    Frequency (%)
1.0        1168398  27.8%
2.0        375403   8.9%
0.0        147021   3.5%
(Missing)  2517155  59.8%
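
gender is stored as float-like strings ("1.0", "2.0", "0.0"), which is why the profiler treats it as categorical text. The sketch below decodes these values. The code book (0 = unknown, 1 = male, 2 = female) is an assumption and is not stated anywhere in this report, so verify it against the data source before use.

```python
# Assumed code book (not stated in the report).
GENDER_LABELS = {0: "unknown", 1: "male", 2: "female"}


def decode_gender(raw):
    """Map values such as "1.0" to a label; missing or unrecognized input gives None."""
    if raw is None or raw == "":
        return None
    try:
        code = float(raw)
    except (TypeError, ValueError):
        return None
    if code != code or not code.is_integer():  # NaN, infinity or fractional
        return None
    return GENDER_LABELS.get(int(code))


print([decode_gender(v) for v in ["1.0", "2.0", "0.0", None]])
```

Decoding before profiling would also remove the "." character that makes up a third of this column's characters.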

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1.0    1168398  69.1%
2.0    375403   22.2%
0.0    147021   8.7%

Most occurring characters

ValueCountFrequency (%)
0 1837843
36.2%
. 1690822
33.3%
1 1168398
23.0%
2 375403
 
7.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3381644
66.7%
Other Punctuation 1690822
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1837843
54.3%
1 1168398
34.6%
2 375403
 
11.1%
Other Punctuation
ValueCountFrequency (%)
. 1690822
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5072466
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1837843
36.2%
. 1690822
33.3%
1 1168398
23.0%
2 375403
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5072466
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1837843
36.2%
. 1690822
33.3%
1 1168398
23.0%
2 375403
 
7.4%

ride_id
Text

MISSING 

Distinct: 2517155
Distinct (%): 100.0%
Missing: 1690822
Missing (%): 40.2%
Memory size: 258.9 MiB

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Characters and Unicode

Total characters: 40274480
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2517155
Unique (%): 100.0%

Sample

1st row: 74A4206E7487CBC9
2nd row: 58EEE2950FFE01CE
3rd row: 1429D912C16EEE59
4th row: FE9C5B74167CBCCD
5th row: B88D37626F000BBA
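
The figures above (fixed length 16, 16 distinct characters, only digits and uppercase letters) are consistent with ride_id being a 16-character uppercase hexadecimal string. A validation sketch under that assumption (the pattern and function name are illustrative):

```python
import re

# Assumed format: exactly 16 uppercase hexadecimal characters.
RIDE_ID_PATTERN = re.compile(r"[0-9A-F]{16}")


def is_valid_ride_id(value):
    """True if value is a string matching the assumed ride_id format."""
    return isinstance(value, str) and RIDE_ID_PATTERN.fullmatch(value) is not None


samples = [
    "74A4206E7487CBC9",
    "58EEE2950FFE01CE",
    "1429D912C16EEE59",
    "FE9C5B74167CBCCD",
    "B88D37626F000BBA",
]
print(all(is_valid_ride_id(s) for s in samples))

# 2517155 non-missing IDs of 16 characters each give the reported total.
total_characters = 2_517_155 * 16
print(total_characters)
```

Since every non-missing value is distinct, ride_id can serve as a row key for the rows that carry it.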

Value                   Count    Frequency (%)
ba95401d75622ccb        1        < 0.1%
f71732ece7cd02cf        1        < 0.1%
d4b9012b427b1ba8        1        < 0.1%
2a99fcafc6b1c7ce        1        < 0.1%
b31e3625915a88e1        1        < 0.1%
e0b99a4aaca77034        1        < 0.1%
bd52c7872af620e6        1        < 0.1%
20bd42adad2ca6a0        1        < 0.1%
23467b6f12bca65b        1        < 0.1%
cd4d1af21695161a        1        < 0.1%
Other values (2517145)  2517145  > 99.9%

Most occurring characters

ValueCountFrequency (%)
8 2519031
 
6.3%
3 2519025
 
6.3%
2 2518483
 
6.3%
5 2518448
 
6.3%
A 2518241
 
6.3%
1 2517494
 
6.3%
D 2517478
 
6.3%
B 2517225
 
6.3%
F 2517212
 
6.3%
E 2517109
 
6.2%
Other values (6) 15094734
37.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25171441
62.5%
Uppercase Letter 15103039
37.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 2519031
10.0%
3 2519025
10.0%
2 2518483
10.0%
5 2518448
10.0%
1 2517494
10.0%
7 2516992
10.0%
9 2515879
10.0%
4 2515838
10.0%
0 2515428
10.0%
6 2514823
10.0%
Uppercase Letter
ValueCountFrequency (%)
A 2518241
16.7%
D 2517478
16.7%
B 2517225
16.7%
F 2517212
16.7%
E 2517109
16.7%
C 2515774
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 25171441
62.5%
Latin 15103039
37.5%

Most frequent character per script

Common
ValueCountFrequency (%)
8 2519031
10.0%
3 2519025
10.0%
2 2518483
10.0%
5 2518448
10.0%
1 2517494
10.0%
7 2516992
10.0%
9 2515879
10.0%
4 2515838
10.0%
0 2515428
10.0%
6 2514823
10.0%
Latin
ValueCountFrequency (%)
A 2518241
16.7%
D 2517478
16.7%
B 2517225
16.7%
F 2517212
16.7%
E 2517109
16.7%
C 2515774
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40274480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 2519031
 
6.3%
3 2519025
 
6.3%
2 2518483
 
6.3%
5 2518448
 
6.3%
A 2518241
 
6.3%
1 2517494
 
6.3%
D 2517478
 
6.3%
B 2517225
 
6.3%
F 2517212
 
6.3%
E 2517109
 
6.2%
Other values (6) 15094734
37.5%

rideable_type
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 1690822
Missing (%): 40.2%
Memory size: 288.4 MiB
classic_bike: 1846473
electric_bike: 528108
docked_bike: 142574

Length

Max length: 13
Median length: 12
Mean length: 12.153163
Min length: 11

Characters and Unicode

Total characters: 30591394
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: docked_bike
2nd row: docked_bike
3rd row: docked_bike
4th row: docked_bike
5th row: docked_bike

Common Values

Value          Count    Frequency (%)
classic_bike   1846473  43.9%
electric_bike  528108   12.6%
docked_bike    142574   3.4%
(Missing)      1690822  40.2%
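
The length and character totals reported for rideable_type can be rebuilt from the category counts alone. The sketch below does so, and also shows why the same category has two different percentages in this section: one is a share of all rows, the other a share of non-missing rows.

```python
# Counts copied from the Common Values table.
counts = {"classic_bike": 1_846_473, "electric_bike": 528_108, "docked_bike": 142_574}
n_observations = 4_207_977

n_present = sum(counts.values())
total_characters = sum(len(name) * count for name, count in counts.items())
mean_length = total_characters / n_present

# Share of all rows versus share of rows where the value is present.
share_of_all = {name: 100 * count / n_observations for name, count in counts.items()}
share_of_present = {name: 100 * count / n_present for name, count in counts.items()}

print(n_present, total_characters, round(mean_length, 6))
print(round(share_of_all["classic_bike"], 1), round(share_of_present["classic_bike"], 1))
```

When comparing categories across the two schemas in this dataset, the share of non-missing rows is usually the more meaningful figure.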

Length

Histogram of lengths of the category

Common Values (Plot)

Value          Count    Frequency (%)
classic_bike   1846473  73.4%
electric_bike  528108   21.0%
docked_bike    142574   5.7%

Most occurring characters

ValueCountFrequency (%)
c 4891736
16.0%
i 4891736
16.0%
e 3715945
12.1%
s 3692946
12.1%
k 2659729
8.7%
b 2517155
8.2%
_ 2517155
8.2%
l 2374581
7.8%
a 1846473
 
6.0%
t 528108
 
1.7%
Other values (3) 955830
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28074239
91.8%
Connector Punctuation 2517155
 
8.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 4891736
17.4%
i 4891736
17.4%
e 3715945
13.2%
s 3692946
13.2%
k 2659729
9.5%
b 2517155
9.0%
l 2374581
8.5%
a 1846473
 
6.6%
t 528108
 
1.9%
r 528108
 
1.9%
Other values (2) 427722
 
1.5%
Connector Punctuation
ValueCountFrequency (%)
_ 2517155
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28074239
91.8%
Common 2517155
 
8.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 4891736
17.4%
i 4891736
17.4%
e 3715945
13.2%
s 3692946
13.2%
k 2659729
9.5%
b 2517155
9.0%
l 2374581
8.5%
a 1846473
 
6.6%
t 528108
 
1.9%
r 528108
 
1.9%
Other values (2) 427722
 
1.5%
Common
ValueCountFrequency (%)
_ 2517155
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30591394
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 4891736
16.0%
i 4891736
16.0%
e 3715945
12.1%
s 3692946
12.1%
k 2659729
8.7%
b 2517155
8.2%
_ 2517155
8.2%
l 2374581
7.8%
a 1846473
 
6.0%
t 528108
 
1.7%
Other values (3) 955830
 
3.1%

member_casual
Categorical

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1690822
Missing (%): 40.2%
Memory size: 273.6 MiB
member: 1694959
casual: 822196

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 15102930
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: casual
2nd row: member
3rd row: casual
4th row: casual
5th row: casual

Common Values

Value      Count    Frequency (%)
member     1694959  40.3%
casual     822196   19.5%
(Missing)  1690822  40.2%
Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count    Frequency (%)
member  1694959  67.3%
casual  822196   32.7%

Most occurring characters

ValueCountFrequency (%)
m 3389918
22.4%
e 3389918
22.4%
b 1694959
11.2%
r 1694959
11.2%
a 1644392
10.9%
c 822196
 
5.4%
s 822196
 
5.4%
u 822196
 
5.4%
l 822196
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15102930
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 3389918
22.4%
e 3389918
22.4%
b 1694959
11.2%
r 1694959
11.2%
a 1644392
10.9%
c 822196
 
5.4%
s 822196
 
5.4%
u 822196
 
5.4%
l 822196
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 15102930
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 3389918
22.4%
e 3389918
22.4%
b 1694959
11.2%
r 1694959
11.2%
a 1644392
10.9%
c 822196
 
5.4%
s 822196
 
5.4%
u 822196
 
5.4%
l 822196
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15102930
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 3389918
22.4%
e 3389918
22.4%
b 1694959
11.2%
r 1694959
11.2%
a 1644392
10.9%
c 822196
 
5.4%
s 822196
 
5.4%
u 822196
 
5.4%
l 822196
 
5.4%

Interactions

[Figures: 49 pairwise interaction plots; images not preserved]

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
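Nullity correlation can be illustrated as the Pearson correlation between per-column "is missing" indicators. The sketch below is a standard-library illustration of that idea, not the profiler's own implementation. The first three rows borrow bike_id and gender values from the sample rows in this report; the last two rows, including the ride_id strings "A1" and "B2", are hypothetical stand-ins for records in the newer schema.

```python
from math import sqrt

# Tiny illustrative dataset: None marks a missing cell.
rows = [
    {"bike_id": 24575.0, "gender": 1.0, "ride_id": None},
    {"bike_id": 24723.0, "gender": 1.0, "ride_id": None},
    {"bike_id": 24620.0, "gender": 1.0, "ride_id": None},
    {"bike_id": None, "gender": None, "ride_id": "A1"},  # hypothetical row
    {"bike_id": None, "gender": None, "ride_id": "B2"},  # hypothetical row
]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def nullity_corr(data, col_a, col_b):
    """Correlate the missing-value indicators (1 = missing) of two columns."""
    missing_a = [int(row[col_a] is None) for row in data]
    missing_b = [int(row[col_b] is None) for row in data]
    return pearson(missing_a, missing_b)


same_schema = nullity_corr(rows, "bike_id", "gender")
cross_schema = nullity_corr(rows, "bike_id", "ride_id")
print(same_schema, cross_schema)
```

Columns that are always missing together score close to +1, and columns where one is present exactly when the other is absent score close to -1. Note that a column with no missing values (or all missing) has zero variance in its indicator, so this sketch would divide by zero for it.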

Sample

| | trip_duration | start_time | stop_time | start_station_id | start_station_name | start_station_latitude | start_station_longitude | end_station_id | end_station_name | end_station_latitude | end_station_longitude | bike_id | user_type | birth_year | gender | ride_id | rideable_type | member_casual |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 148.0 | 2017-01-01 00:21:32 | 2017-01-01 00:24:01 | 3276 | Marin Light Rail | 40.714584 | -74.042817 | 3185 | City Hall | 40.717732 | -74.043845 | 24575.0 | Subscriber | 1983.0 | 1.0 | NaN | NaN | NaN |
| 1 | 1283.0 | 2017-01-01 00:24:35 | 2017-01-01 00:45:58 | 3183 | Exchange Place | 40.716247 | -74.033459 | 3198 | Heights Elevator | 40.748716 | -74.040443 | 24723.0 | Subscriber | 1978.0 | 1.0 | NaN | NaN | NaN |
| 2 | 372.0 | 2017-01-01 00:38:19 | 2017-01-01 00:44:31 | 3183 | Exchange Place | 40.716247 | -74.033459 | 3211 | Newark Ave | 40.721525 | -74.046305 | 24620.0 | Subscriber | 1989.0 | 1.0 | NaN | NaN | NaN |
| 3 | 1513.0 | 2017-01-01 00:38:37 | 2017-01-01 01:03:50 | 3194 | McGinley Square | 40.725340 | -74.067622 | 3271 | Danforth Light Rail | 40.692640 | -74.088012 | 24668.0 | Subscriber | 1961.0 | 1.0 | NaN | NaN | NaN |
| 4 | 639.0 | 2017-01-01 01:47:52 | 2017-01-01 01:58:31 | 3183 | Exchange Place | 40.716247 | -74.033459 | 3203 | Hamilton Park | 40.727596 | -74.044247 | 26167.0 | Subscriber | 1993.0 | 1.0 | NaN | NaN | NaN |
| 5 | 258.0 | 2017-01-01 01:56:29 | 2017-01-01 02:00:48 | 3186 | Grove St PATH | 40.719586 | -74.043117 | 3270 | Jersey & 6th St | 40.725289 | -74.045572 | 24604.0 | Subscriber | 1970.0 | 1.0 | NaN | NaN | NaN |
| 6 | 663.0 | 2017-01-01 02:12:34 | 2017-01-01 02:23:38 | 3270 | Jersey & 6th St | 40.725289 | -74.045572 | 3206 | Hilltop | 40.731169 | -74.057574 | 24641.0 | Subscriber | 1978.0 | 1.0 | NaN | NaN | NaN |
| 7 | 567.0 | 2017-01-01 02:15:52 | 2017-01-01 02:25:20 | 3192 | Liberty Light Rail | 40.711242 | -74.055701 | 3213 | Van Vorst Park | 40.718489 | -74.047727 | 24513.0 | Subscriber | 1970.0 | 1.0 | NaN | NaN | NaN |
| 8 | 551.0 | 2017-01-01 02:16:03 | 2017-01-01 02:25:15 | 3192 | Liberty Light Rail | 40.711242 | -74.055701 | 3213 | Van Vorst Park | 40.718489 | -74.047727 | 24463.0 | Subscriber | 1967.0 | 2.0 | NaN | NaN | NaN |
| 9 | 573.0 | 2017-01-01 02:22:17 | 2017-01-01 02:31:50 | 3212 | Christ Hospital | 40.734786 | -74.050444 | 3225 | Baldwin at Montgomery | 40.723659 | -74.064194 | 24486.0 | Subscriber | 1984.0 | 1.0 | NaN | NaN | NaN |

| | trip_duration | start_time | stop_time | start_station_id | start_station_name | start_station_latitude | start_station_longitude | end_station_id | end_station_name | end_station_latitude | end_station_longitude | bike_id | user_type | birth_year | gender | ride_id | rideable_type | member_casual |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 53823 | 76.0 | 2020-10-13 14:58:27.7520 | 2020-10-13 14:59:44.4970 | 3185 | City Hall | 40.717732 | -74.043845 | 3185 | City Hall | 40.717732 | -74.043845 | 44737.0 | Subscriber | 1976.0 | 1.0 | NaN | NaN | NaN |
| 53824 | 305.0 | 2020-10-13 15:00:09.3240 | 2020-10-13 15:05:15.1470 | 3186 | Grove St PATH | 40.719586 | -74.043117 | 3203 | Hamilton Park | 40.727596 | -74.044247 | 42213.0 | Subscriber | 1996.0 | 1.0 | NaN | NaN | NaN |
| 53825 | 99.0 | 2020-10-13 15:00:35.9260 | 2020-10-13 15:02:15.3180 | 3185 | City Hall | 40.717732 | -74.043845 | 3185 | City Hall | 40.717732 | -74.043845 | 44347.0 | Subscriber | 1976.0 | 1.0 | NaN | NaN | NaN |
| 53826 | 492.0 | 2020-10-13 15:03:05.3220 | 2020-10-13 15:11:18.0610 | 3195 | Sip Ave | 40.730897 | -74.063913 | 3225 | Baldwin at Montgomery | 40.723659 | -74.064194 | 46598.0 | Customer | 1998.0 | 1.0 | NaN | NaN | NaN |
| 53827 | 715.0 | 2020-10-13 15:04:16.8400 | 2020-10-13 15:16:12.3870 | 3185 | City Hall | 40.717732 | -74.043845 | 3269 | Brunswick & 6th | 40.726012 | -74.050389 | 44737.0 | Subscriber | 1976.0 | 1.0 | NaN | NaN | NaN |
| 53828 | 270.0 | 2020-10-13 15:05:35.0860 | 2020-10-13 15:10:05.3430 | 3207 | Oakland Ave | 40.737604 | -74.052478 | 3640 | Journal Square | 40.733670 | -74.062500 | 44744.0 | Subscriber | 1963.0 | 2.0 | NaN | NaN | NaN |
| 53829 | 400.0 | 2020-10-13 15:09:03.4890 | 2020-10-13 15:15:43.9750 | 3209 | Brunswick St | 40.724176 | -74.050656 | 3209 | Brunswick St | 40.724176 | -74.050656 | 45345.0 | Subscriber | 1984.0 | 1.0 | NaN | NaN | NaN |
| 53830 | 206.0 | 2020-10-13 15:11:34.3500 | 2020-10-13 15:15:00.5030 | 3195 | Sip Ave | 40.730897 | -74.063913 | 3194 | McGinley Square | 40.725340 | -74.067622 | 47019.0 | Subscriber | 1993.0 | 1.0 | NaN | NaN | NaN |
| 53831 | 216.0 | 2020-10-13 15:11:49.1510 | 2020-10-13 15:15:25.6930 | 3195 | Sip Ave | 40.730897 | -74.063913 | 3225 | Baldwin at Montgomery | 40.723659 | -74.064194 | 42191.0 | Subscriber | 1966.0 | 1.0 | NaN | NaN | NaN |
| 53832 | 418.0 | 2020-10-13 15:12:31.7920 | 2020-10-13 15:19:30.5280 | 3267 | Morris Canal | 40.712419 | -74.038526 | 3186 | Grove St PATH | 40.719586 | -74.043117 | 47255.0 | Subscriber | 1991.0 | 1.0 | NaN | NaN | NaN |

Duplicate rows

Most frequently occurring

| | trip_duration | start_time | stop_time | start_station_name | start_station_latitude | start_station_longitude | end_station_name | end_station_latitude | end_station_longitude | bike_id | user_type | birth_year | gender | ride_id | rideable_type | member_casual | # duplicates |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 61.0 | 2020-10-07 14:59:28.8250 | 2020-10-07 15:00:30.2640 | Sip Ave | 40.730897 | -74.063913 | Sip Ave | 40.730897 | -74.063913 | 42535.0 | Subscriber | 1977.0 | 2.0 | NaN | NaN | NaN | 2 |
| 1 | 61.0 | 2020-10-08 16:56:51.9260 | 2020-10-08 16:57:53.2300 | Brunswick & 6th | 40.726012 | -74.050389 | Brunswick St | 40.724176 | -74.050656 | 33887.0 | Subscriber | 1987.0 | 1.0 | NaN | NaN | NaN | 2 |
| 2 | 62.0 | 2020-10-04 11:39:28.5210 | 2020-10-04 11:40:31.3950 | Warren St | 40.721124 | -74.038051 | Warren St | 40.721124 | -74.038051 | 42195.0 | Subscriber | 1970.0 | 2.0 | NaN | NaN | NaN | 2 |
| 3 | 62.0 | 2020-10-05 18:18:44.0370 | 2020-10-05 18:19:46.3250 | Jersey & 6th St | 40.725289 | -74.045572 | Jersey & 3rd | 40.723332 | -74.045953 | 33859.0 | Subscriber | 1977.0 | 1.0 | NaN | NaN | NaN | 2 |
| 4 | 62.0 | 2020-10-08 12:25:14.0660 | 2020-10-08 12:26:16.5070 | City Hall | 40.717732 | -74.043845 | City Hall | 40.717732 | -74.043845 | 33859.0 | Subscriber | 1972.0 | 1.0 | NaN | NaN | NaN | 2 |
| 5 | 63.0 | 2020-10-07 19:00:46.0890 | 2020-10-07 19:01:49.2280 | Newport Pkwy | 40.728745 | -74.032108 | Newport Pkwy | 40.728745 | -74.032108 | 42460.0 | Subscriber | 1986.0 | 1.0 | NaN | NaN | NaN | 2 |
| 6 | 64.0 | 2020-10-12 20:41:50.8680 | 2020-10-12 20:42:55.3970 | Warren St | 40.721124 | -74.038051 | Warren St | 40.721124 | -74.038051 | 44687.0 | Customer | 1987.0 | 1.0 | NaN | NaN | NaN | 2 |
| 7 | 66.0 | 2020-10-04 19:07:55.7000 | 2020-10-04 19:09:02.2950 | Monmouth and 6th | 40.725685 | -74.048790 | Monmouth and 6th | 40.725685 | -74.048790 | 44418.0 | Subscriber | 1999.0 | 1.0 | NaN | NaN | NaN | 2 |
| 8 | 67.0 | 2020-10-09 16:21:35.3800 | 2020-10-09 16:22:42.4140 | Newport PATH | 40.727224 | -74.033759 | Newport PATH | 40.727224 | -74.033759 | 45366.0 | Subscriber | 1984.0 | 1.0 | NaN | NaN | NaN | 2 |
| 9 | 68.0 | 2020-10-04 10:25:26.5910 | 2020-10-04 10:26:35.1320 | Warren St | 40.721124 | -74.038051 | Columbus Drive | 40.718355 | -74.038914 | 45150.0 | Subscriber | 1991.0 | 1.0 | NaN | NaN | NaN | 2 |
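
A "# duplicates" column like the one above can be derived by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below shows that idea with the standard library. It is an illustration only and not necessarily how the profiler computes its figure. The rows are abbreviated to four fields (trip_duration, start_time, start_station_name, bike_id); the first two distinct rows come from the duplicates table and the last one from the sample rows.

```python
from collections import Counter

# Abbreviated rows: (trip_duration, start_time, start_station_name, bike_id).
# Tuples are hashable, so whole rows can be counted directly.
rows = [
    (61.0, "2020-10-07 14:59:28.8250", "Sip Ave", 42535.0),
    (61.0, "2020-10-07 14:59:28.8250", "Sip Ave", 42535.0),
    (62.0, "2020-10-04 11:39:28.5210", "Warren St", 42195.0),
    (62.0, "2020-10-04 11:39:28.5210", "Warren St", 42195.0),
    (99.0, "2020-10-13 15:00:35.9260", "City Hall", 44347.0),  # occurs once
]

occurrences = Counter(rows)

# Keep only rows that appear more than once, with their occurrence count.
duplicated = {row: n for row, n in occurrences.items() if n > 1}

# Dropping repeats while preserving first-seen order.
deduplicated = list(dict.fromkeys(rows))

for row, n in duplicated.items():
    print(row, n)
```

Whether such repeats should be dropped depends on the source: identical start time, station, and bike down to the millisecond suggests a repeated record rather than two distinct trips, but that is an inference from the table, not something the report states.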